502 research outputs found

    Boosting Adversarial Transferability by Block Shuffle and Rotation

    Adversarial examples mislead deep neural networks with imperceptible perturbations and have brought significant threats to deep learning. An important aspect is their transferability, which refers to their ability to deceive other models, thus enabling attacks in the black-box setting. Though various methods have been proposed to boost transferability, the performance still falls short compared with white-box attacks. In this work, we observe that existing input transformation based attacks, one of the mainstream classes of transfer-based attacks, result in different attention heatmaps on various models, which might limit the transferability. We also find that breaking the intrinsic relation of the image can disrupt the attention heatmap of the original image. Based on this finding, we propose a novel input transformation based attack called block shuffle and rotation (BSR). Specifically, BSR splits the input image into several blocks, then randomly shuffles and rotates these blocks to construct a set of new images for gradient calculation. Empirical evaluations on the ImageNet dataset demonstrate that BSR achieves significantly better transferability than the existing input transformation based methods under single-model and ensemble-model settings. Combining BSR with existing input transformation methods can further improve the transferability, significantly outperforming the state-of-the-art methods.
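
    The block splitting, shuffling and rotation step described above can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not taken from the abstract: the image is a square grayscale grid stored as nested lists, the blocks are equal-sized, and rotations are limited to multiples of 90 degrees. The function names and parameters (`block_shuffle_rotate`, `transformed_set`, `num_blocks`, `num_copies`) are hypothetical, and the gradient computation over the resulting images is omitted.

```python
import random


def rotate90(block, k):
    """Rotate a square block clockwise by k quarter turns."""
    for _ in range(k % 4):
        block = [list(row) for row in zip(*block[::-1])]
    return block


def block_shuffle_rotate(image, num_blocks=2, rng=None):
    """Split a square image into a num_blocks x num_blocks grid,
    rotate each block by a random quarter turn, then shuffle the blocks.

    Simplified sketch: equal-sized blocks and 90-degree rotations are
    assumptions made here so that every block keeps its shape.
    """
    if rng is None:
        rng = random.Random()
    n = len(image)
    if any(len(row) != n for row in image) or n % num_blocks != 0:
        raise ValueError("expected a square image with side divisible by num_blocks")
    size = n // num_blocks

    # Cut the image into blocks (row-major order) and rotate each one.
    blocks = []
    for r in range(num_blocks):
        for c in range(num_blocks):
            rows = image[r * size:(r + 1) * size]
            block = [row[c * size:(c + 1) * size] for row in rows]
            blocks.append(rotate90(block, rng.randrange(4)))

    # Shuffle block positions, then stitch the grid back together.
    rng.shuffle(blocks)
    out = []
    for r in range(num_blocks):
        row_blocks = blocks[r * num_blocks:(r + 1) * num_blocks]
        for i in range(size):
            out.append([value for b in row_blocks for value in b[i]])
    return out


def transformed_set(image, num_copies=4, num_blocks=2, seed=0):
    """Build several independently transformed copies of one image."""
    rng = random.Random(seed)
    return [block_shuffle_rotate(image, num_blocks, rng) for _ in range(num_copies)]
```

    Each transformed copy contains exactly the same pixels as the input, only rearranged, which is the sense in which the transformation breaks the spatial relations within the image without changing its content.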

    Optimal Variable Speed Limit Control Strategy on Freeway Segments under Fog Conditions

    Fog is a critical external factor that threatens traffic safety on freeways. Variable speed limit (VSL) control can effectively harmonize vehicle speed and improve safety. However, most existing weather-related VSL controllers have limited ability to adapt to the dynamic traffic environment. This study developed an optimal VSL control strategy under fog conditions with full consideration of the factors that affect traffic safety risks. The crash risk under fog conditions was estimated using a crash risk prediction model based on Bayesian logistic regression. The traffic flow with VSL control was simulated by a modified cell transmission model (MCTM). The optimal factors of VSL control were obtained by solving an optimization problem that coordinated safety and mobility with the help of the genetic algorithm. An example of I-405 in California, USA was designed to simulate and evaluate the effects of the proposed VSL control strategy. The optimal VSL control factors under fog conditions were compared with those under sunny conditions, and different placements of VSL signs were evaluated. Results showed that the optimal VSL control strategy under fog conditions changed the speed limit more cautiously. The VSL control under fog conditions in this study effectively reduced crash risks without significantly increasing travel time, achieving up to a 37.15% reduction in crash risk with only a 0.48% increase in total travel time. The proposed VSL control strategy is expected to be of great use in the development of VSL systems to enhance freeway safety under fog conditions.
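
    To make the simulation component more concrete, the following is a minimal sketch of one update step of the standard (textbook) cell transmission model with a per-cell speed limit. It is not the modified model (MCTM) used in the study, whose details the abstract does not give. The function name `ctm_step` and all parameter values (cell length, time step, wave speed, capacity, jam density) are assumptions chosen for illustration.

```python
def ctm_step(rho, speed_limit, demand,
             dx=0.5, dt=10 / 3600, w=20.0, q_max=2000.0, rho_jam=150.0):
    """Advance cell densities by one time step of a basic cell transmission model.

    rho          densities per cell (veh/km)
    speed_limit  posted speed per cell (km/h), used as the free-flow speed
    demand       flow wishing to enter the first cell (veh/h)
    dx, dt       cell length (km) and time step (h); assumed values
    w, q_max     backward wave speed (km/h) and capacity (veh/h); assumed values
    rho_jam      jam density (veh/km); assumed value
    """
    n = len(rho)
    # Flow each cell can send downstream and receive from upstream.
    send = [min(speed_limit[i] * rho[i], q_max) for i in range(n)]
    recv = [min(q_max, w * (rho_jam - rho[i])) for i in range(n)]

    # flow[i] enters cell i, flow[i + 1] leaves it; the last cell drains freely.
    flow = [min(demand, recv[0])]
    flow += [min(send[i], recv[i + 1]) for i in range(n - 1)]
    flow.append(send[-1])

    # Conservation of vehicles in each cell.
    return [rho[i] + dt / dx * (flow[i] - flow[i + 1]) for i in range(n)]
```

    In a setup like this, lowering `speed_limit` in a cell reduces the flow it sends downstream, which is the lever a VSL controller adjusts; a crash risk model and an optimizer such as a genetic algorithm would sit on top of the simulation.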

    A Causal View of Entity Bias in (Large) Language Models

    Entity bias widely affects pretrained (large) language models, causing them to rely on (biased) parametric knowledge to make unfaithful predictions. Although causality-inspired methods have shown great potential to mitigate entity bias, it is hard to precisely estimate the parameters of underlying causal models in practice. The rise of black-box LLMs also makes the situation even worse, because of their inaccessible parameters and uncalibrated logits. To address these problems, we propose a specific structured causal model (SCM) whose parameters are comparatively easier to estimate. Building upon this SCM, we propose causal intervention techniques to mitigate entity bias for both white-box and black-box settings. The proposed causal intervention perturbs the original entity with neighboring entities. This intervention reduces specific biasing information pertaining to the original entity while still preserving sufficient semantic information from similar entities. Under the white-box setting, our training-time intervention improves OOD performance of PLMs on relation extraction (RE) and machine reading comprehension (MRC) by 5.7 points and by 9.1 points, respectively. Under the black-box setting, our in-context intervention effectively reduces the entity-based knowledge conflicts of GPT-3.5, achieving up to 20.5 points of improvement of exact match accuracy on MRC and up to 17.6 points of reduction in memorization ratio on RE. Our code is available at https://github.com/luka-group/Causal-View-of-Entity-Bias. Comment: Findings of EMNLP 202
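
    The idea of perturbing an entity with neighboring entities can be illustrated with a very small sketch. Everything here is hypothetical: the neighbor table is hard-coded rather than derived from embeddings or a knowledge base, the function name `perturb_entity` is invented for this example, and plain string substitution stands in for whatever intervention the authors actually implement in their repository.

```python
# Hypothetical neighbor table; a real system would derive neighbors
# from entity embeddings or a knowledge base.
NEIGHBORS = {
    "Paris": ["Lyon", "Berlin", "Madrid"],
}


def perturb_entity(text, entity, neighbors, k=2):
    """Return copies of `text` with `entity` replaced by up to k neighbors.

    Each copy keeps the surrounding context but swaps in a similar
    entity, so a model cannot lean on what it memorized about the
    original one.
    """
    candidates = neighbors.get(entity, [])[:k]
    return [text.replace(entity, candidate) for candidate in candidates]


variants = perturb_entity("Alice moved to Paris in 2010.", "Paris", NEIGHBORS, k=2)
```

    Here `variants` holds two perturbed sentences, one per neighbor; an entity with no entry in the table yields an empty list.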

    Adaptive Test-Time Personalization for Federated Learning

    Personalized federated learning algorithms have shown promising results in adapting models to various distribution shifts. However, most of these methods require labeled data on testing clients for personalization, which is usually unavailable in real-world scenarios. In this paper, we introduce a novel setting called test-time personalized federated learning (TTPFL), where clients locally adapt a global model in an unsupervised way without relying on any labeled data during test-time. While traditional test-time adaptation (TTA) methods can be used in this scenario, most of them inherently assume that the training data come from a single domain, whereas here they come from multiple clients (source domains) with different distributions. Overlooking these domain interrelationships can result in suboptimal generalization. Moreover, most TTA algorithms are designed for a specific kind of distribution shift and lack the flexibility to handle multiple kinds of distribution shifts in FL. In this paper, we find that this lack of flexibility partially results from their pre-defining which modules to adapt in the model. To tackle this challenge, we propose a novel algorithm called ATP that adaptively learns the adaptation rates for each module in the model from distribution shifts among source domains. Theoretical analysis proves the strong generalization of ATP. Extensive experiments demonstrate its superiority in handling various distribution shifts including label shift, image corruptions, and domain shift, outperforming existing TTA methods across multiple datasets and model architectures. Our code is available at https://github.com/baowenxuan/ATP. Comment: Accepted by NeurIPS 202
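
    As a rough illustration of per-module adaptation rates, the sketch below applies one test-time update in which each module of a model moves by its own rate. This is an assumption-laden toy, not the ATP algorithm: parameters are plain lists of floats, the update directions are given rather than computed from an unsupervised objective, and the rates are supplied by hand instead of being learned from source clients. The names `adapt`, `params`, `updates` and `rates` are hypothetical.

```python
def adapt(params, updates, rates):
    """Apply one test-time update with a separate rate for each module.

    params   module name -> list of parameter values
    updates  module name -> list of update directions (same lengths)
    rates    module name -> adaptation rate; missing modules stay frozen
    """
    return {
        name: [p - rates.get(name, 0.0) * u for p, u in zip(values, updates[name])]
        for name, values in params.items()
    }


params = {"bn": [1.0, 2.0], "fc": [0.5]}
updates = {"bn": [0.5, -0.5], "fc": [1.0]}
rates = {"bn": 0.5, "fc": 0.0}  # hypothetical rates: adapt "bn", freeze "fc"

adapted = adapt(params, updates, rates)
```

    A rate of zero freezes a module and a larger rate lets it adapt more, so a single rate vector can express different choices of which modules to adapt instead of fixing that choice in advance.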